Low cost video compression using fast, modified z-coding of wavelet pyramids

ABSTRACT

An entropy efficient video coder for wavelet pyramids approaches the entropy-limited coding rate of video wavelet pyramids, is fast in both hardware and software implementations, and has low complexity (no multiplies) for use in ASICs. It uses a modified Z-coder to code the zero/non-zero significance function and Huffman coding for the non-zero coefficients themselves. The encoding unit includes a significance function generator that receives coefficients and outputs a single significance bit. A zero coefficient eliminator receives coefficients in parallel with the significance function generator and outputs coefficients if non-zero. Output from the significance function generator is coded using the modified Z-coder. Output from the zero coefficient eliminator is coded using Huffman coding. Both outputs are combined to form the resulting compressed stream. The modified Z-coder is similar to a standard Z-coder but uses a different technique for the LPS (least probable symbol) case during encoding and decoding that results in a Z-coder that functions appropriately.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application is a continuation of U.S. patent application Ser. No. 11/289,862, entitled LOW COST VIDEO COMPRESSION USING FAST, MODIFIED Z-CODING OF WAVELET PYRAMIDS filed Nov. 29, 2005 which is incorporated herein by reference for all purposes, and which is a continuation of U.S. patent application Ser. No. 10/397,663, entitled LOW COST VIDEO COMPRESSION USING FAST, MODIFIED Z-CODING OF WAVELET PYRAMIDS filed Mar. 25, 2003 (now U.S. Pat. No. 7,016,416) which is incorporated herein by reference for all purposes, and which is a continuation of U.S. patent application Ser. No. 09/444,226, entitled LOW COST VIDEO COMPRESSION USING FAST, MODIFIED Z-CODING OF WAVELET PYRAMIDS filed Nov. 19, 1999 (now U.S. Pat. No. 6,570,924) which is incorporated herein by reference for all purposes, and which claims the benefit of U.S. Provisional Application No. 60/109,323, entitled FAST, MODIFIED Z-CODING OF WAVELET PYRAMIDS filed Nov. 20, 1998 which is incorporated herein by reference for all purposes.

TECHNICAL FIELD

The present invention relates generally to compression and decompression of data. More specifically, the present invention relates to a fast, low-complexity video coder/decoder.

BACKGROUND

A number of important applications in image processing require a very low cost, fast and good quality video codec (coder/decoder) implementation that achieves a good compression ratio. In particular, a low cost and fast implementation is desirable for low bit rate video applications such as video cassette recorders (VCRs), cable television, cameras, set-top boxes and other consumer devices. It is often desirable for such a codec to be implemented on a low-cost, relatively small, single integrated circuit.

In general, an image transform codec consists of three steps: 1) a reversible transform, often linear, of the pixels for the purpose of decorrelation, 2) quantization of the transform values, and 3) entropy coding of the quantized transform coefficients. In general, a fast, low cost codec is desirable that would operate on any string of symbols (bits, for example) and not necessarily those produced as part of an image transform. For purposes of illustration, though, and for ease of understanding by the reader, a background is discussed in the context of compression of video images, although the applicability of the invention is not so limited.

A brief background on video images will now be described. FIG. 1 illustrates a prior art image representation scheme that uses pixels, scan lines, stripes and blocks. Frame 12 represents a still image produced from any of a variety of sources such as a video camera, a television, a computer monitor etc. In an imaging system where progressive scan is used each image 12 is a frame. In systems where interlaced scan is used, each image 12 represents a field of information. Image 12 may also represent other breakdowns of a still image depending upon the type of scanning being used. Information in frame 12 is represented by any number of pixels 14. Each pixel in turn represents digitized information and is often represented by 8 bits, although each pixel may be represented by any number of bits.

Each scan line 16 includes any number of pixels 14, thereby representing a horizontal line of information within frame 12. Typically, groups of 8 horizontal scan lines are organized into a stripe 18. A block of information 20 is one stripe high by a certain number of pixels wide. For example, depending upon the standard being used, a block may be 8×8 pixels, 8×32 pixels, or any other size. In this fashion, an image is broken down into blocks and these blocks are then transmitted, compressed, processed or otherwise manipulated depending upon the application. In NTSC video (a television standard using interlaced scan), for example, a field of information appears every 60th of a second, a frame (including 2 fields) appears every 30th of a second and the continuous presentation of frames of information produces a picture. On a computer monitor using progressive scan, a frame of information is refreshed on the screen every 30th of a second to produce the display seen by a user.

As mentioned earlier, compression of such video images (for example) involves transformation, quantization and encoding. Many prior art encoding techniques are well-known, including arithmetic coding. Arithmetic coding is extremely effective and achieves nearly the highest compression but at a cost. Arithmetic coding is computationally intensive: it requires multipliers when implemented in hardware (more gates needed) and runs longer when implemented in software. As such, coders that only perform shifts and adds without multiplication are often desirable for implementation in hardware.

One such coder is the Z-coder, described in The Z-Coder Adaptive Coder, L. Bottou, P. G. Howard, and Y. Bengio, Proceedings of the Data Compression Conference, pp. 13-22, Snowbird, Utah, March 1998. The Z-coder described achieves high compression without the use of multipliers. Although the Z-coder described in the above paper has the promise to be an effective codec, it may not perform as well as described.

Therefore, a compression technique for data in general and for video in particular is desirable which may be implemented in hardware of modest size and very low cost. It would be further desirable for such a compression technique to take advantage of the benefits provided by the Z-coder.

SUMMARY

To achieve the foregoing, and in accordance with the purpose of the present invention, a modified Z-coder is disclosed that achieves low cost, fast compression and decompression of data.

A fast, low-complexity, entropy efficient video coder for wavelet pyramids is described, although the invention is not limited to video compression nor to a transform using wavelets. This coder approaches the entropy-limited coding rate of video wavelet pyramids, is fast in both hardware and software implementations, and has low complexity (no multiplies) for use in ASICs. It uses a modified Z-coder to code the zero/non-zero significance function and Huffman coding for the non-zero coefficients themselves. Adaptation is not required. There is a strong speed-memory trade-off for the Huffman tables allowing the coder to be customized to a variety of platform parameters.

The present invention is implementable in a small amount of silicon area, at a modest cost in coding efficiency. With only 15% of the coefficients requiring coding of the coefficient value, speed and efficiency in identifying that minority of values via the significance function is an important step. The average run of correct prediction of significance values is about 20, so efficient run coding is important. While the importance of the 3 bits of context and the asymmetry strongly indicates the use of an arithmetic coder, an arithmetic coder can be too costly.

The requirement for a fast algorithm implementable in minimal silicon area demands that something other than a traditional arithmetic coder be used. In particular, multiplies are to be avoided as they are very expensive in silicon area. The modified Z-coder presented herein provides a codec that avoids multiplies, provides very good compression and functions appropriately to encode and decode bit streams.

Another advantage of the modified Z-coder is its simplicity and speed in view of hardware implementation. In one embodiment in software non-optimized for speed, the modified Z-coder is several orders of magnitude faster than the commercial (well optimized) MPEG2 software encoder used for the same quality. An optimized modified Z-coder should achieve 20-30 times improvement in performance with respect to MPEG2.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings in which:

FIG. 1 illustrates a prior art image representation scheme.

FIG. 2 illustrates a prior art technique for compression of a video stream.

FIG. 3 is a block diagram of a system for compressing video images according to one embodiment of the invention.

FIG. 4 is a graph of the probability of LPS versus Delta.

FIG. 5 is a flowchart describing a modified Z-encoder according to one embodiment of the invention.

FIG. 6 is a flowchart describing a modified Z-decoder according to one embodiment of the invention.

FIGS. 7 and 8 illustrate a computer system suitable for implementing embodiments of the present invention.

DETAILED DESCRIPTION

As previously mentioned, and as shown in FIG. 2, an image transform codec typically includes three steps: 1) a reversible transform, often linear, of the pixels for the purpose of decorrelation, 2) quantization of the transform values, and 3) entropy coding of the quantized transform coefficients. The present invention describes an entropy codec which is fast, efficient in silicon area, coding-wise efficient, and practical when the transform is a wavelet pyramid. Although the present invention is presented herein in the context of image compression using a wavelet transform, the present invention is applicable for encoding of any suitable bit stream, and not necessarily a bit stream from an image nor for use necessarily with a wavelet transform.

FIG. 2 illustrates a prior art technique for compression of a video stream. Step 52 receives the pixels from a video image and performs a transform. Any linear or non-linear transform may be used including wavelet, DCT, fractal, etc. The coefficients produced from the transform are then input to step 54 where quantization is performed, an optional step. Quantization is well-known in the art and any suitable quantizer may be used. Next, the quantized coefficients are input to an encoder in step 56 where the coefficients are encoded for further compression. Any suitable known encoding algorithm may be used. The output from the encoder is the compressed video stream. Decompression of the compressed video stream is a reverse of compression.

Wavelet Pyramid Embodiment

One embodiment of the invention for use on video uses quantized wavelet pyramids derived from NTSC video quantized to be viewed under standard conditions. These video pyramids have substantial runs of zeros and also substantial runs of non-zeros. In this embodiment, a modification of the Z-codec is developed and applied to code zero vs. non-zero in quantized video pyramids. Z-codecs have the advantage of a simple (no multiplies) and fast implementation combined with coding performance approximating that of an arithmetic codec. The modified Z-coder implementation described herein approximates an adaptive binary arithmetic coder using dyadic broken line approximation. It has a very short “fast path” and is attractive for application to a wavelet significance function. The non-zero coefficients of the pyramid are coded (coefficient-by-coefficient) reasonably efficiently with standard Huffman coding.

A wavelet transform is known in the art, and a specific use of a wavelet transform on an image to be compressed is described in U.S. patent application Ser. No. 09/079,101 which is incorporated by reference. A variation on a wavelet transform for image compression is described in U.S. patent application Ser. No. 09/087,449 which is also incorporated by reference.

The typical wavelet pyramid acts as a filter bank separating an image into subbands each covering approximately one octave (factor of 2). At each octave typically there are three subbands corresponding to horizontal, vertical, and checkerboard features. Pyramids are typically three to five levels deep, covering the same number of octaves.

If the original image is at all smooth (images typically have a Hölder coefficient of ⅔ meaning roughly that the image has ⅔ of a derivative) the magnitude of the wavelet coefficients decreases rapidly. If the wavelet coefficients are arranged in descending order of absolute value, those absolute values will be seen to decrease as N^(−s) where N is the position in the sequence and s is the smoothness of the image. The wavelet pyramid is further sharpened if the wavelet pyramid is scaled to match the characteristics of the human visual system (HVS). Preferably, fewer bits are used in the chroma subbands.

This particular embodiment considers wavelet pyramids drawn from interlaced video consisting of fields of 240×640 pixels. A frame consists of two interlaced fields and is 480×640 pixels. A standard viewing condition is to view such video from six picture heights away so that each pixel subtends 1/(480×6) radians or about (1/48)°. There are therefore 24 pixel pairs (cycles)/° in both the horizontal and vertical directions.
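By way of illustration only, the viewing geometry above can be checked with a few lines of arithmetic. The sketch below assumes the small-angle approximation already implied by the text (one pixel subtends 1/(480×6) radians); the variable names are chosen here and are not part of the described embodiment. The exact figures come out slightly above the rounded values quoted above.

```python
import math

# Assumed inputs taken from the text: 480 picture lines, viewed from 6 picture heights.
LINES_PER_FRAME = 480
PICTURE_HEIGHTS = 6

# Small-angle approximation: angle subtended by one pixel, in radians.
pixel_angle_rad = 1.0 / (LINES_PER_FRAME * PICTURE_HEIGHTS)

# Convert to degrees and derive pixels and cycles (pixel pairs) per degree.
pixel_angle_deg = math.degrees(pixel_angle_rad)
pixels_per_degree = 1.0 / pixel_angle_deg
cycles_per_degree = pixels_per_degree / 2.0

print(f"pixel angle : {pixel_angle_deg:.5f} degrees")
print(f"pixels/deg  : {pixels_per_degree:.2f}")
print(f"cycles/deg  : {cycles_per_degree:.2f}")
```

The computed values are roughly 50 pixels and 25 cycles per degree, close to the rounded figures of 48 and 24 used in the description.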

After forming the wavelet pyramid, the wavelet coefficients are scaled (quantized) consistent with the viewing conditions above and the HVS contrast sensitivity function. Each block has the coefficients arranged by subband, with the coarsest subbands first and the finest subband last. Each subband is scanned out video-wise, by row, left to right, from the top row to the bottom row. Thus the magnitude of the coefficients decreases significantly through a block with the significant coefficients clustered at the beginning of the block and the insignificant ones clustered at the end of the block.

The video wavelet pyramid coefficients, by block, are quantized to about 0.5 bits/pixel. About 85% of the wavelet coefficients are zero. The significance value of a coefficient is most likely to be the significance value of the preceding coefficient. Using this rule, 95% of the significance values are correctly predicted. There is an asymmetry in that a significant coefficient preceded by an insignificant coefficient is much more likely than an insignificant coefficient preceded by a significant one. An isolated significant coefficient embedded in a (fine subband) run of insignificant ones is much more likely than an isolated insignificant one in a (coarse subband) run of significant ones.
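The prediction rule just described (each significance value is predicted to equal the preceding one) can be measured on any significance sequence. The following is a minimal sketch; the function name and the toy sequence are hypothetical and do not reproduce the video statistics reported above.

```python
def previous_value_accuracy(significance):
    """Fraction of significance values equal to the value that precedes them.

    `significance` is a sequence of 0/1 values; the first element has no
    predecessor and is therefore not scored.
    """
    if len(significance) < 2:
        raise ValueError("need at least two significance values")
    pairs = zip(significance, significance[1:])
    hits = sum(1 for previous, current in pairs if previous == current)
    return hits / (len(significance) - 1)


# Toy data for illustration only: a short run of significant coefficients
# surrounded by zeros.
toy = [0, 0, 0, 1, 1, 0, 0, 0]
accuracy = previous_value_accuracy(toy)
zero_fraction = toy.count(0) / len(toy)
print(accuracy, zero_fraction)
```

Applied to real quantized pyramids, the same measurement would be expected to approach the figures quoted in the text.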

Extending the preceding context to more than just the preceding coefficient does not qualitatively change the prediction but it does affect the probability of the significance of the next coefficient. Preferably, a context of 8 coefficients is useful and allows better prediction due to vertical adjacency. Because the resulting statistics appear stable over a wide range of clips, we have done without the adaptation of the probability tables and have used fixed probabilities.

Compression System Embodiment

FIG. 3 is a block diagram of a system 100 for compressing video images according to one embodiment of the invention. System 100 is for illustrative purposes only; the present invention in general is applicable for encoding/decoding a wide variety of bit streams from many sources. System 100 may be implemented in either hardware or software, and its construction will be apparent to those of skill in the art upon a reading of the description below.

System 100 includes a transform unit 104 that receives pixels representing a series of video images. As mentioned earlier, any of a wide variety of transforms may be used; a wavelet transform works well in this embodiment. The coefficients from transform unit 104 are fed into quantizer 106 which quantizes the coefficients using any suitable technique. In this embodiment, coefficients of 18 bits are then fed into encoding unit 110 for the final step of encoding.

Encoding unit 110 receives the 18-bit coefficients and passes them in parallel to a significance function generator 112 and a zero coefficient eliminator 114. Significance function generator 112 generates a “1” if any bit in the coefficient is a “1”, and generates a “0” if all bits in the coefficient are “0”s. This function may be performed by a logical “OR” upon all the bits in a coefficient. The zero coefficient eliminator 114 outputs all 18 bits of a coefficient if any bit in the coefficient is a “1”, and outputs nothing if all of the bits are “0”.
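The two parallel paths of encoding unit 110 can be sketched in software as follows. This is an illustrative model only: the function names are hypothetical, coefficients are modeled as plain integers masked to 18 bits, and the hardware embodiment would use an OR gate tree rather than a comparison.

```python
COEFF_BITS = 18                      # coefficient width stated in the text
COEFF_MASK = (1 << COEFF_BITS) - 1   # keeps only the low 18 bits


def significance_bit(coefficient):
    """Logical OR over all 18 bits: 1 if any bit is set, otherwise 0."""
    return 1 if (coefficient & COEFF_MASK) != 0 else 0


def split_streams(coefficients):
    """Model generator 112 and eliminator 114 operating in parallel.

    Returns (significance_bits, nonzero_coefficients): one bit per input
    coefficient, and the 18-bit patterns of the non-zero coefficients only.
    """
    significance = [significance_bit(c) for c in coefficients]
    nonzero = [c & COEFF_MASK for c in coefficients if significance_bit(c)]
    return significance, nonzero


bits, values = split_streams([0, 5, 0, 0, 3])
print(bits, values)
```

In this model the first list would feed the modified Z-coder and the second list would feed the Huffman coder.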

The modified Z-coder 116 accepts the stream of bits from generator 112 and encodes these bits according to an embodiment of the invention. The modified Z-coder 116 can accept any suitable stream of bits, and not necessarily those resulting from video compression. Implementation of the modified Z-coder is described below and flowcharts for encoding and decoding are presented in FIGS. 5 and 6.

Huffman coder 118 receives an 18-bit coefficient from eliminator 114 and encodes the coefficient using the well-known Huffman algorithm. The output from Huffman coder 118 is a variable number of bits per coefficient.

Preferably, Huffman encoding is performed in the following way. The distribution of the values of the non-zero coefficients in this video compression embodiment demonstrates the preponderance of small values. Also, the bits after the first few have little effect on the distribution and can encode themselves (self-encode). The sign (non-zeros only) also has nearly a 50-50 probability and can efficiently self-encode. Encoding can therefore be done efficiently by table look-up.

We begin by taking the absolute value of the coefficient and self-encoding the sign. We then take the last few bits (e.g., bits 0-7) of the coefficient, test to see if the remainder bits (8-N) are only leading zeros, and if so use the last few bits to index into a table E1 (2⁸ entries). The table will contain the Huffman code and the number of bits in the Huffman code. The Huffman codes can be prepared by lumping all values greater than 255, making coding room for the larger values.

If bits 8-N are not zero but bits 14-N are zero, bits 6-13 are used to index into another table E2. It will also contain the Huffman code and its length. The codes for this table can be prepared by separating the lumps described in the previous paragraph. Appropriate coding room is left for even larger values (after emitting the Huffman code for bits 6-13, the self-coded bits 0-5 are emitted). This process may be iterated as required. The sizes of the tables and the number of levels can be varied in the obvious ways.
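The table selection described in the two preceding paragraphs can be sketched as follows. The sketch covers only the choice of table, the index into it, and the self-coded low bits; it does not construct the Huffman codes themselves. The function names, the returned tuple layout, and the convention that a sign bit of 1 denotes a negative coefficient are assumptions made for illustration.

```python
def split_sign(coefficient):
    """Separate the self-coded sign from the magnitude (1 = negative, assumed)."""
    sign = 1 if coefficient < 0 else 0
    return sign, abs(coefficient)


def select_table(magnitude):
    """Choose the encoding table for a non-zero coefficient magnitude.

    Returns (table_name, table_index, self_coded_bits, self_coded_count).
    """
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    if magnitude >> 8 == 0:
        # Bits 8 and above are all zero: bits 0-7 index table E1 directly.
        return ("E1", magnitude & 0xFF, 0, 0)
    if magnitude >> 14 == 0:
        # Bits 14 and above are zero: bits 6-13 index table E2 and
        # bits 0-5 are emitted as themselves after the Huffman code.
        return ("E2", (magnitude >> 6) & 0xFF, magnitude & 0x3F, 6)
    # The text notes the process may be iterated; further levels are
    # not sketched here.
    raise ValueError("further table levels are not sketched here")


print(select_table(200))
print(select_table(4660))
```

Varying the shift amounts and masks corresponds to varying the table sizes and the number of levels as mentioned above.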

In an embodiment where a compressed bit stream is decoded to produce the original stream, Huffman decoding can be performed in the following manner. The first step is to input the sign bit. Then the next 8 bits are used to index into a table D1 (without removing them from the input string). There is a high probability that the next Huffman code will be a head of this index, but this is not guaranteed. A flag in the table indicates which case holds.

In the first, high probability, (terminal entry) case table D1 needs to contain the decoded bits (8 of them in our example) and the number of bits in the Huffman code. The indicated number of bits are removed from the input string. Table D1 also needs a count of the number of self-coded bits that follow and these bits are removed from the input string and composed with the decoded value and the sign to recover the coefficient.

In the second case, the table D1 entry contains the location and log₂ length (k) of a follow-up table Df_(i). The 8 bits used to index D1 are but a head of the full Huffman code and are removed from the input string. The next k bits are used to index Df_(i). The process is repeated until a terminal table entry is located. The “k”s may vary from entry to entry. Optimization of these values will trade off table space for execution time.
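The terminal-entry case of the table-driven decoding described above can be sketched as follows. This is a simplified illustration under stated assumptions: every code in the toy prefix code is at most 8 bits long, so follow-up tables Df_(i), the sign bit, and self-coded bits are all omitted; the toy code and the function names are hypothetical.

```python
INDEX_BITS = 8  # number of bits used to index table D1, as in the text


def build_d1(codes):
    """Build a lookup table from a prefix code {symbol: bit string}.

    Every index whose leading bits equal a code maps to (symbol, code length).
    """
    table = [None] * (1 << INDEX_BITS)
    for symbol, code in codes.items():
        if len(code) > INDEX_BITS:
            raise ValueError("codes longer than the index need follow-up tables")
        pad = INDEX_BITS - len(code)
        base = int(code, 2) << pad
        for tail in range(1 << pad):
            table[base + tail] = (symbol, len(code))
    return table


def decode(bits, table, count):
    """Decode `count` symbols from a string of '0'/'1' characters."""
    padded = bits + "0" * INDEX_BITS   # lets the final peek read a full index
    position = 0
    symbols = []
    for _ in range(count):
        # Peek at the next 8 bits without removing them from the input.
        index = int(padded[position:position + INDEX_BITS], 2)
        symbol, length = table[index]
        symbols.append(symbol)
        position += length             # remove only the bits of the code
    return symbols


toy_code = {"a": "0", "b": "10", "c": "110", "d": "111"}
d1 = build_d1(toy_code)
print(decode("0101111100", d1, 5))
```

A larger index width gives faster decoding at the cost of a larger table, which is the speed-memory trade-off noted in the text.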

Bit combinator 120 combines bits output from modified Z-coder 116 and from Huffman coder 118. As will be appreciated by those of skill in the art, combinator 120 recombines significance bits from modified Z-coder 116 with the corresponding Huffman encoded coefficients from Huffman coder 118 and outputs the resulting bits as compressed image 122.

A Modified Z-Coder

Optimally, a fast algorithm is preferable that is implementable in a small amount of silicon area, even at some modest cost in coding efficiency. With only 15% of the coefficients requiring coding of the coefficient value, speed and efficiency in identifying that minority of values via the significance function is an important problem. The average run of correct prediction of significance values is about 20, so efficient run coding is also important. Additionally, the importance of the 3 bits of context and the asymmetry strongly indicates the use of an arithmetic coder.

However, the requirement for a fast algorithm implementable in minimal silicon area demands that something other than a traditional arithmetic coder be used. In particular, multiplies are to be avoided as they are very expensive in silicon area. The chosen algorithm should have a very good “fast path” for the individual elements of the runs. The fact that the significance function has only two values is a specialization not taken advantage of by arithmetic coders in general, but is recognized by a Z-coder.

The Z-coder, described in The Z-Coder Adaptive Coder, referenced above, can be viewed as a coder for a binary symbol set which approximates an arithmetic coder. As described, it approximates the coding curve by a dyadic broken line. This enables a binary coder with a short “fast path” and without requiring multiplies. These properties make it an attractive candidate for coding the wavelet coefficient significance function.

Unfortunately, the Z-coder as described in The Z-Coder Adaptive Coder may not perform particularly well. The present invention thus provides below a modified Z-coder that performs extremely well for encoding streams of bits in general, and for use in image compression in particular. Operation of the modified Z-coder 116 of FIG. 3 will now be described generally, and then in detail with reference to the encoding and decoding flowcharts of FIGS. 5 and 6.

Familiarity with The Z-Coder Adaptive Coder is assumed in the following description. Recall that the preceding context (ctx) predicts with probability P(ctx)=P(MPS) the next symbol. The bit predicted is referred to as MPS (Most Probable Symbol) whereas the other choice (there are only two symbols in the set) is referred to as LPS (Least Probable Symbol). We always have P(ctx)>=½>=1−P(ctx)=P(LPS). In The Z-Coder Adaptive Coder, Δ is given implicitly, as a function of P(LPS), as

P(LPS)=Δ−(Δ+½)log_(e)(Δ+½)−(Δ−½)log_(e)(½)   (1.0)

The graph of P is shown in FIG. 4. We always have 0<Δ≦½ (n. b., sure symbols are eliminated).
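Because equation (1.0) gives Δ only implicitly, a numerical inversion is one way to obtain Δ from a given P(LPS). The sketch below evaluates equation (1.0) in floating point and inverts it by bisection, relying on the fact that the right-hand side increases with Δ on the interval from 0 to ½. The function names and the bisection approach are assumptions made for illustration; a hardware embodiment would more likely store precomputed Δ values.

```python
import math


def p_lps(delta):
    """Right-hand side of equation (1.0), using natural logarithms."""
    return (delta
            - (delta + 0.5) * math.log(delta + 0.5)
            - (delta - 0.5) * math.log(0.5))


def delta_for(p, tolerance=1e-12):
    """Invert equation (1.0) by bisection for 0 <= p <= 1/2."""
    if not 0.0 <= p <= 0.5:
        raise ValueError("P(LPS) must lie in [0, 1/2]")
    low, high = 0.0, 0.5
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if p_lps(middle) < p:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


for delta in (1 / 128, 1 / 8, 1 / 4, 1 / 2):
    print(f"delta = {delta:.7f}  ->  P(LPS) = {p_lps(delta):.7f}")
```

Evaluated this way, the fixed probability values listed for the NTSC video embodiment later in this description appear to correspond, to within rounding, to the dyadic values Δ = 1/128, 1/8, 1/4 and 1/2.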

As in an arithmetic coder, the code word C is a real binary number with a normalized lower bound A. The split point Z computation is given in equation (1.3).

0≦A≦C<1   (1.1)

A<½  (1.2)

A<Z=A+Δ<1   (1.3)

In other words a split point Z in (A, 1) is determined by the probability

prob(MPS|context)=1−prob(LPS|context)

C in [Z, 1) codes MPS while C in [A, Z) codes LPS. If we wish to code an MPS we arrange to output code bits so that Z≦C<1; we have A≦C<Z to code an LPS.

On encode we start not knowing any of the bits of C. However, if the lead (i.e., 2⁻¹) bit of Z and of A are identical then we know that the lead bit of C must agree. So we shift (normalize) the binary point of everything one place to the right and subsequently ignore bits to the left of the binary point as A, C and Z must agree there. The normalizing shift ensures that (1.2) holds at the beginning of the next symbol. The MPS case is relatively easy since the normalizing shifts ensure that A_(new)=Z_(old)≦C_(new)<1.

A head of C is the code word for a head of the input symbol string. It is dyadically normalized into the interval [0, ½), becoming the lower bound A. A becomes renormalized C and the process is repeated for the next symbol (renormalization being a multiplication by 2).

The LPS case is more delicate. Since the correct C is unknown right of the binary point, we perform binary point shifts (bits shifted out of A go to the code string) until we are assured that the C that appears in the decoder will be not less than the A that appears in the decoder. Z is shifted in lock step with A. When the integer part of Z exceeds the integer part of A we are assured that C<Z at the decoder. Continuing to shift until A=0 assures that A<=Z. Taking fractional parts, C is in [A, 1), A is in [0, ½) and Z is in (A, 1).

FIG. 5 is a flowchart describing a modified Z-encoder according to one embodiment of the invention. In step 304 bits are input from any suitable source into the modified Z-encoder. In step 308 context transformation is performed upon the input bit string to convert the bits into an {MPS, LPS} string. Conversion into an MPS (Most Probable Symbol)/LPS (Least Probable Symbol) string is a step known in the art. In step 312 a real binary number A is calculated as each symbol is input. For initialization, A is first set equal to zero. A is a binary number having both integer and fractional parts; calculation of the value A is a standard step in arithmetic coding. A is calculated as each new symbol is input; thus, its value is continually increasing as bits are input to the encoder. Preferably, in this embodiment, the last bit of A is kept equal to 0 so that adjustments of Z as described in The Z-Coder Adaptive Coder referenced above generate only integral values.

Step 316 begins a loop that processes each input symbol in turn, i.e., each symbol will be one of an MPS or an LPS. In step 316 the value Z is calculated as is known in the prior art. In this embodiment, 8 bits (in general, m bits) are kept after the binary point for the calculated values, although other implementations may keep fewer or a greater number of bits. Thus, the calculated value for Z will satisfy the condition:

fract(A)+2^(−m) <Z<1

The following will also hold true:

A<int(A)+Z

Step 320 checks whether the next input symbol is an MPS; if so, control moves to step 324. If not, then the next input symbol is an LPS and control moves to step 332. In step 324 the fractional part of A is replaced with the value Z. In step 328 A is normalized. In this description, normalization of A involves first checking if the fractional part of A is greater than or equal to one-half. If so, then A is replaced by 2A, i.e., A is shifted left by one binary place. If not, no action is taken. This shifting is repeated until the fractional part of A is less than one-half. The result of normalization of A provides that:

0<=fract(A)<½

Normalization is performed because we know the 2⁻¹ bit of C. It is also necessary to keep A from growing without bound. Without normalization, A will eventually exceed 1.0. After step 328, processing of the MPS condition is done and control returns to step 316 to process the next input symbol and to calculate a new Z value.
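The normalization of step 328 can be sketched with integer arithmetic as follows. In this illustrative model, which is an assumption and not taken from the flowchart, A is held as an unbounded integer scaled by 2^m, so the low m bits are the fractional part and the higher bits are the integer part that accumulates code bits.

```python
def normalize(a, m=8):
    """Shift A left until its fractional part is below one-half.

    `a` is A scaled by 2**m (m fractional bits). Returns the normalized
    value and the number of one-bit shifts performed.
    """
    half = 1 << (m - 1)              # one-half in the scaled representation
    fraction_mask = (1 << m) - 1     # selects the m fractional bits
    shifts = 0
    while (a & fraction_mask) >= half:
        a <<= 1                      # replace A by 2A
        shifts += 1
    return a, shifts


# 0.75 (binary 0.11) needs two shifts; its two leading 1 bits move into
# the integer part and the fractional part becomes zero.
print(normalize(0b11000000))
# 0.25 (binary 0.01) is already below one-half and is left unchanged.
print(normalize(0b01000000))
```

The loop always terminates within m iterations, since each shift moves one fractional bit into the integer part and fills the low end with zero.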

If, on the other hand, step 320 determines that the input symbol is an LPS, then control moves to step 332. At this point, the below steps diverge from the known prior art of the Z-coder. In the known Z-coder, the corresponding steps do not necessarily produce an encoded output that can be relied upon to be decoded back into the original bit stream. Advantageously, we have realized a novel technique to process the LPS condition which does produce an encoded output that can be decoded back into the original bit stream. These steps are presented below.

In step 332 the Z-greater flag is set to false. The Z-greater flag keeps track of whether Z is greater than A. Step 336 begins a while loop that ends at step 352. In step 340 the Z-greater flag is set to true for this loop if there is a 0 to the direct right of the binary point in A, and if there is a 1 to the direct right of the binary point in Z. This operation may be performed by replacing Z-greater with the expression:

Z-greater OR (fract(A)<½ AND ˜(fract(Z)<½))
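The flag update of step 340 can be sketched as below, using the same scaled-integer model as an assumption (values scaled by 2^m, with the low m bits as the fractional part). The function name is hypothetical, and the sketch covers only this one step, not the surrounding while loop.

```python
def update_z_greater(z_greater, a, z, m=8):
    """Return the new Z-greater flag for one pass of the LPS loop.

    The flag becomes true when the bit just right of the binary point is
    0 in A and 1 in Z; once true it stays true.
    """
    half = 1 << (m - 1)
    fraction_mask = (1 << m) - 1
    a_lead_is_zero = (a & fraction_mask) < half    # fract(A) < 1/2
    z_lead_is_one = (z & fraction_mask) >= half    # NOT (fract(Z) < 1/2)
    return z_greater or (a_lead_is_zero and z_lead_is_one)


print(update_z_greater(False, 0b01000000, 0b11000000))  # A = 0.01, Z = 0.11
print(update_z_greater(False, 0b11000000, 0b11000000))  # lead bits agree
```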

In step 344 the value A is replaced with 2A, i.e., A is shifted to the left. In step 348 Z is replaced with the fractional part of 2Z. In step 352 the while loop continues as long as the condition in step 336 holds true. When the condition fails, steps 324 and 328 are executed as previously described, and control returns to step 316 to process the next symbol.

As a final termination step, A is shifted by m bits. The final output code word C is equal to A.

FIG. 6 is a flowchart describing a modified Z-decoder according to one embodiment of the invention. In step 402 initialization is performed by setting A to 0 and by setting code word C equal to the encoded string to be input such that 0<=C<1. In other words, the binary point is placed in front of the value C so that there is no integer portion.

In step 404 the code word C is input to the modified Z-decoder. Step 416 begins a loop in which the value Z is calculated as is known in the prior art. In this embodiment, 8 bits (in general, m bits) are kept after the binary point for the calculated values, although other implementations may keep fewer or a greater number of bits. Thus, the calculated value for Z will satisfy the condition:

fract(A)+2^(−m) <Z<1

The following will also hold true:

A<int(A)+Z

Step 420 checks whether the fractional part of C is greater than or equal to Z; if so, control moves to step 422 in which an MPS bit is output from the code word C. If not, then control moves to step 431 in which an LPS bit is output from the code word C. Returning now to the MPS case, in step 424 the fractional part of A is replaced with the value Z. In step 428 A is normalized as described above. In step 430 C is replaced with the fractional part of 2C. After step 430, processing of the MPS condition is done and control returns to step 416 to calculate a new Z value.

If, on the other hand, step 420 determines that the fractional part of C is less than Z, then control moves to step 431. At this point, the below steps diverge from the known prior art of the Z-coder. In the known Z-coder, the corresponding steps do not necessarily produce a decoded output that can be relied upon to represent the original bit stream. Advantageously, we have realized a novel technique to process the LPS condition which does produce a decoded output that does return the original bit stream. These steps are presented below.

In step 432 the Z-greater flag is set to false. The Z-greater flag keeps track of whether Z is greater than A. Step 436 begins a while loop that ends at step 452. In step 444 the value A is replaced with 2A, i.e., A is shifted to the left. In step 448 Z is replaced with the fractional part of 2Z. In step 452 the while loop continues as long as the condition in step 436 holds true. When the condition fails, step 456 determines whether the last bit in code word C has been processed. If not, control returns to steps 424-430 and a new Z value is eventually calculated in step 416. If so, then in step 460 the reverse context transformation is performed on the output {MPS, LPS} string to convert into the original bit stream.

NTSC Video Embodiment

In one embodiment, the modified Z-coder is applied to NTSC wavelet video. The input is a D2 digitization of NTSC video where the chroma1 and chroma2 are quadrature modulated on a 3.58 MHz sub-carrier. The modified Z-coder uses a composite 2-6 wavelet pyramid described in Very Low Cost Video Wavelet Codec, K. Kolarov, W. Lynch, SPIE Conference on Applications of Digital Image Processing, Vol. 3808, Denver, July 1999, plus two levels of Haar pyramid in the time direction (4 field GOP). The dyadic quantization coefficients are powers of 2.

The modified Z-coder was used on several NTSC clips which vary in content and origin. The first clip is a cable broadcast of an interview without much motion. The second clip is a clean, high quality sequence from a laser disk with a panning motion of a fence with vertical bars close together and motion of cars in the background. The next clip is a DSS (satellite) recording of a basketball game (already MPEG2 compressed/decompressed) with lots of motion and detailed crowd and field. The last clip is a high quality sequence from a laser disk with a zooming motion on a bridge with a number of diagonal cables. The size of the frames is 720×486 (standard NTSC) in .tga (targa) format.

The probability values that were used are as follows:

P₀=0.0107696; P₁=0.2924747; P₂=0.5; P₃=0.1588221;
P₄=0.2924747; P₅=0.2924747; P₆=0.5; P₇=0.1588221

Three bits of context are used and the subscripts above denote the different contexts. This choice is made because returns diminish after a few bits of context and 95% prediction can be achieved with 3 bits of context. In the results described below, a very crude scheme is used for non-zero coefficients coding. Only leading zeroes are coded off; the sign and the other bits are coded as themselves. Significance bits in the interval (0, 9) are coded in 9 bits, those in (8, 14) in 23 bits and those in (13, 19) in 37 bits. Most non-zeros are coded in 9 bits.
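
To give a sense of what these probabilities imply, the sketch below computes the binary entropy of each context, i.e., the ideal average cost in bits of one binary symbol having the listed probability. Forming the context index from three previously coded bits, and the names context_index and binary_entropy, are assumptions for illustration; the specification states only that three bits of context are used.

```python
import math

# Per-context probabilities P0..P7 as listed above.
P = [0.0107696, 0.2924747, 0.5, 0.1588221,
     0.2924747, 0.2924747, 0.5, 0.1588221]


def context_index(prev_bits):
    """Pack three previous bits (most recent last) into an index 0..7.

    Assumption: which bits form the context is not given in the text.
    """
    b2, b1, b0 = prev_bits
    return (b2 << 2) | (b1 << 1) | b0


def binary_entropy(p):
    """Ideal bits per symbol for a binary source with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


if __name__ == "__main__":
    for i, p in enumerate(P):
        print(f"context {i}: p={p:.7f}  entropy={binary_entropy(p):.4f} bits")
```

Under these assumptions, context 0 costs well under 0.1 bit per symbol, while contexts 2 and 6 (probability 0.5) cost a full bit.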

For comparison we have used a high quality commercially available MPEG2 codec from PixelTools. The MPEG2 was generated using the best possible settings for high-quality compression. In this comparison the following are used: 15 frames in a GOP (group of pictures), 3 frames between anchor frames, 29.97 frame rate, 4:2:0 chroma format, medium search range double precision DCT prediction, stuffing enabled, motion estimation sub-sampling by one. The sequences were compressed at 1.0 bpp and 0.5 bpp.
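
For orientation, the bits-per-pixel figures can be converted to approximate bit rates using the frame size and frame rate stated above. The sketch below assumes that bpp counts bits per pixel of the 720×486 frame; the function name bit_rate_mbps is illustrative.

```python
def bit_rate_mbps(width, height, frames_per_second, bits_per_pixel):
    """Approximate bit rate in megabits per second (1 Mbit = 1e6 bits)."""
    return width * height * frames_per_second * bits_per_pixel / 1e6


if __name__ == "__main__":
    for bpp in (1.0, 0.5):
        rate = bit_rate_mbps(720, 486, 29.97, bpp)
        print(f"{bpp} bpp -> {rate:.2f} Mbit/s")
```

Under this assumption, 1.0 bpp corresponds to roughly 10.5 Mbit/s and 0.5 bpp to roughly 5.2 Mbit/s.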

The modified Z-coder (with identity Huffman tables) is also compared with an arithmetic coder described in Very Low Cost Video Wavelet Codec. That algorithm uses a separate arithmetic coder for each bit plane. The transform part for both the modified Z-coder and the arithmetic coder is the same 2-6 wavelet pyramid. This arithmetic coder is on par with MPEG2 in a number of sequences in terms of PSNR (peak signal-to-noise ratio, based on mean-square error).
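
PSNR in comparisons of this kind is commonly computed from the mean-square error between the original and decoded frames. The following is a minimal sketch, assuming 8-bit samples (peak value 255) and frames given as flat sequences of sample values; the function names are illustrative, and this is not necessarily the exact measurement procedure used for the reported results.

```python
import math


def mean_square_error(original, decoded):
    """Mean of squared sample differences between two equal-length frames."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("frames must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr_db(original, decoded, peak=255):
    """PSNR in dB: 10 * log10(peak**2 / MSE); infinite for identical frames."""
    mse = mean_square_error(original, decoded)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR means a lower mean-square error; an error of one gray level on every 8-bit sample gives about 48.13 dB.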

The only sequence for which MPEG2 achieves statistically better PSNR is the basketball sequence, in which MPEG can take advantage of the significant amount of (expensive) motion estimation characteristic of that method. Also, this sequence is a recording from DSS, i.e., it was already MPEG compressed and decompressed before being tested with the coders.

Also, perceptually the quality of MPEG2 vs. arithmetic vs. modified Z-coder is very similar. For the fence sequence in particular the quality of MPEG2 compressed video deteriorates significantly for lower bit rates, even though the PSNR is comparable to the modified Z-coder. Even though the basketball sequence presents an advantage for MPEG in terms of PSNR, visually the three methods are very comparable.

Alternative Embodiments

The steps presented above for a modified Z-coder may also be optimized in any suitable fashion. For example, the known “fast path” or “fence” techniques may be used. Also, the calculation of Z may be varied in an adaptive way to produce a binary context coder. Further, other known pieces of the Z-coder not used above may be added back into the algorithm. The above technique may be used to encode and decode any suitable bit stream and not necessarily a bit stream from video image data. For example, the present invention may be used to code a bit stream representing text from a book or other similar applications.

Computer System Embodiment

FIGS. 7 and 8 illustrate a computer system 900 suitable for implementing embodiments of the present invention. FIG. 7 shows one possible physical form of the computer system. Of course, the computer system may have many physical forms ranging from an integrated circuit, a printed circuit board and a small handheld device up to a huge super computer. Computer system 900 includes a monitor 902, a display 904, a housing 906, a disk drive 908, a keyboard 910 and a mouse 912. Disk 914 is a computer-readable medium used to transfer data to and from computer system 900.

FIG. 8 is an example of a block diagram for computer system 900. Attached to system bus 920 are a wide variety of subsystems. Processor(s) 922 (also referred to as central processing units, or CPUs) are coupled to storage devices including memory 924. Memory 924 includes random access memory (RAM) and read-only memory (ROM). As is well known in the art, ROM acts to transfer data and instructions uni-directionally to the CPU and RAM is used typically to transfer data and instructions in a bi-directional manner. Both of these types of memories may include any suitable type of the computer-readable media described below. A fixed disk 926 is also coupled bi-directionally to CPU 922; it provides additional data storage capacity and may also include any of the computer-readable media described below. Fixed disk 926 may be used to store programs, data and the like and is typically a secondary storage medium (such as a hard disk) that is slower than primary storage. It will be appreciated that the information retained within fixed disk 926 may, in appropriate cases, be incorporated in standard fashion as virtual memory in memory 924. Removable disk 914 may take the form of any of the computer-readable media described below.

CPU 922 is also coupled to a variety of input/output devices such as display 904, keyboard 910, mouse 912 and speakers 930. In general, an input/output device may be any of: video displays, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, biometrics readers, or other computers. CPU 922 optionally may be coupled to another computer or telecommunications network using network interface 940. With such a network interface, it is contemplated that the CPU might receive information from the network, or might output information to the network in the course of performing the above-described method steps. Furthermore, method embodiments of the present invention may execute solely upon CPU 922 or may execute over a network such as the Internet in conjunction with a remote CPU that shares a portion of the processing.

In addition, embodiments of the present invention further relate to computer storage products with a computer-readable medium that have computer code thereon for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include, but are not limited to: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and holographic devices; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs) and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher level code that are executed by a computer using an interpreter.

Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes may be practiced within the scope of the appended claims. Therefore, the described embodiments should be taken as illustrative and not restrictive, and the invention should not be limited to the details given herein but should be defined by the following claims and their full scope of equivalents.

1.-17. (canceled)
18. An encoding unit for encoding a stream of bits, the encoding unit comprising: a significance function generator that receives coefficients and outputs significance bits for the coefficients; a zero coefficient eliminator that receives the coefficients and outputs the coefficients when the coefficients are non-zero; a Huffman coder that receives the coefficients from the zero coefficient eliminator and outputs a Huffman-encoded string; means for performing the function of encoding the significance bits using a modified Z-coder such that the encoded significance bits may be decoded using the modified Z-coder to reproduce substantially the significance bits; and a combination unit that combines the Huffman-encoded string with the encoded significance bits to produce an encoded form of the stream of bits, wherein the stream of bits is associated with video data.
19. An encoding unit as recited in claim 18 used for compressing a video image, the encoding unit further comprising: a wavelet transform unit that transforms pixels from an image to produce the coefficients.
20. An encoding method for encoding a stream of bits, comprising: generating a significance function that receives coefficients and outputs significance bits for the coefficients; generating a zero coefficient eliminator that receives the coefficients and outputs the coefficients when the coefficients are non-zero; encoding the coefficients using a Huffman coder and outputting a Huffman-encoding string; wherein the Huffman coder receives the coefficients from the zero coefficient eliminator; encoding the significance bits using a modified Z-coder such that the encoded significance bits may be decoded to reproduce substantially the significance bits; and combining the Huffman-encoded string with the encoded significance bits to produce an encoded form of the stream of bits, wherein encoding the significance bits using a modified Z-coder comprises performing a binary point shift on a variable A and performing a binary point shift on a variable Z, and wherein variable A represents a normalized lower bound on a code word and variable Z represents a split point.
21. An encoding method as recited in claim 20, used for compressing a video image, the encoding method further comprising: performing a wavelet transformation on pixels from an image to produce the coefficients.
22. A system for encoding a stream of bits comprising: a significance function generator that receives coefficients and outputs significance bits for the coefficients; a zero coefficient eliminator that receives the coefficients and outputs the coefficients when the coefficients are non-zero; a Huffman coder that receives the coefficients from the zero coefficient eliminator and outputs a Huffman-encoded string; a modified Z-coder that encodes the significance bits such that the encoded significance bits may be decoded to reproduce substantially the significance bits; and a combination unit that combines the Huffman-encoded string with the encoded significance bits to produce an encoded form of the stream of bits, and wherein for a given coefficient, the significance function generator outputs a ‘1’ if any bit in the given coefficient is non-zero and outputs a ‘0’ otherwise.
23. A system as recited in claim 22, used for compressing a video image, the system further comprising: a wavelet transform unit that transforms pixels from an image to produce the coefficients.